Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix dylib linking issues within the project #3408

Closed
wants to merge 1 commit into from

Conversation

ishitatsuyuki
Copy link
Contributor

While the name is equal to the root, we still should produce a dynamic one; otherwise the link will fail.
Close #3406

Code sample:

[package]
name = "foo"
version = "0.1.0"
authors = []

[dependencies]
# cool stuffs

[lib]
crate-type = ["dylib"]
# [bin] is implicit

While the name is equal to the root, we still should produce a dynamic one; otherwise the link will fail.
@rust-highfive
Copy link

Thanks for the pull request, and welcome! The Rust team is excited to review your changes, and you should hear from @alexcrichton (or someone else) soon.

If any changes to this PR are deemed necessary, please add them as extra commits. This ensures that the reviewer can see what has changed since they last reviewed the code. Due to the way GitHub handles out-of-date commits, this should also make it reasonably obvious what issues have or haven't been addressed. Large or tricky changes may require several passes of review and changes.

Please see the contribution instructions for more information.

@alexcrichton
Copy link
Member

Thanks for the PR! The original intention of this logic was to pass -C prefer-dynamic if Cargo detected that a crate was a dependency of the "root crate" being compiled. That's been watered down quite a bit over time, though, and reflecting on this check it doesn't actually make that much sense to me any more...

That being said, I don't think that unconditionally applying -C prefer-dynamic is the right solution for all crates with a crate type of dylib. Most of the time crates have historically wanted a C ABI from that and intentionally don't want to link dynamically to the standard library.

In that sense, could you elaborate more on the use case you have here? Do you intentionally want to link to the standard library, and if so how come?

@ishitatsuyuki
Copy link
Contributor Author

This is the case that I'm building a plugin system; and I leave the name of both the [lib] and [bin] as default.

rustc cannot handle dylibs that aren't compiled with -C prefer-dynamic. See rust-lang/rust#19680 for some error messages.

@ishitatsuyuki
Copy link
Contributor Author

Indeed rust-lang/rust#34909 is the root problem, however I believe the original addition is also for the workaround purpose. Anyway, this is a correct workaround, and probably #1501 can be reverted then too.

@alexcrichton
Copy link
Member

Hm so in general rustc doesn't support plugins very well. If you'd like to work with a plugin system, right now the only real guaranteed way to do that is to work through a C ABI. This PR is unfortunately a breaking change to any crate using dylib which wants to statically link the standard library, which I believe is the majority of projects.

@ishitatsuyuki
Copy link
Contributor Author

I believe dylib compiled without dynamic flag doesn't link to the executable at all(rust-lang/rust#34909); then how can this be a breaking change?

@alexcrichton
Copy link
Member

Oh right yeah but there are many dylibs which aren't linked to Rust executables, but rather compiled as dylibs to be used with dlopen or linked to a C project, for example. (and that's what this'll be breaking)

@ishitatsuyuki
Copy link
Contributor Author

That is what cdylib is for. I feel dylib is more suitable for pure-Rust projects.

@ishitatsuyuki
Copy link
Contributor Author

@alexcrichton here's a ping.

dylib is intended to use with Rust projects; extern crate doesn't work with cdylib. However, without the flag it doesn't link at all. Adding it has no bad effects.

@alexcrichton
Copy link
Member

@ishitatsuyuki yes nowadays cdylib is the correct crate type for that use case, but historically cdylib didn't exist. As a result this would unfortunately still be a breaking change for any crate compiling a dylib which hasn't yet updated to a cdylib.

@jsgf
Copy link
Contributor

jsgf commented Jan 19, 2017

I found this issue/PR looking for a way to make it so that if I build a library with crate-type=["rlib","dylib"] then it will build the library as both .rlib and .so, and all its dependencies in the same way (ie, the dylib has all dynamic dependencies, the rlib has all static).

Is that what's expected here?

@alexcrichton
Copy link
Member

@jsgf I personally fee like that's a separate issue, mind opening a separate tracking issue for that?

@ishitatsuyuki
Copy link
Contributor Author

I agree with @jsgf, but I'd also like to separate the job.

Anyway, I'm pretty sure that this one is safe. @alexcrichton can you give a review? This doesn't break things, since most dylib users have been doing this with RUSTFLAGS.

@alexcrichton
Copy link
Member

@ishitatsuyuki I've given this a review in the sense that we basically cannot merge this as is. Historical crates which should use cdylib today may still be using dylib. As a result this is a breaking change, and if we were to merge it then not nearly enough preparation has been done to canvas the community.

@ishitatsuyuki
Copy link
Contributor Author

@alexcrichton Please give me a specific case that this could break things. Even rustc dynamically links everything including std (or it won't link at all). This PR is solving a broken behavior, not adding a feature.

@alexcrichton
Copy link
Member

A simple hello world should break:

$ cargo new foo
$ cd foo 
# edit Cargo.toml to `crate-type = ["dylib"]`
$ cargo build
$ ldd target/debug/libfoo.so

Before this PR there is no dynamic dependency on libstd, after there is.

@jsgf
Copy link
Contributor

jsgf commented Jan 25, 2017

Is that a breakage? Wouldn't that still work?

@alexcrichton
Copy link
Member

@jsgf it's not breakage in the sense that it still compiles, but it's breakage in how libfoo.so is linked. Project historically expected it to not dynamically link to libstd, but with this PR as-is it would link dynamically. That semantic change would be a breaking change.

@ishitatsuyuki
Copy link
Contributor Author

It's not always correct to block breaking changes; the current behavior is quite annoying (I have to export RUSTFLAGS everytime I switch the workspace).

If you focus on the breakage, please add a crate that is utilizing this behavior.

@alexcrichton
Copy link
Member

@ishitatsuyuki I'm sorry I'm not sure I quite follow. Do you agree that this is a breaking change? Do you think we should ignore that fact and land this anyway?

I agree that the current behavior is unfortunate, but I believe this is a breaking change and we cannot simply ignore breaking changes.

@jsgf
Copy link
Contributor

jsgf commented Feb 13, 2017

@alexcrichton -
I'm struggling to think of a breakage that would occur from converting the linkage to libstd from static to shared. The only things I can think of are so incredibly system-dependent that I'm not sure they'd constitute well-defined supported behaviour anyway.

On the other hand, I can't think of a good reason to link libstd statically. It's very brittle (you either end up with multiple copies of libstd in the process at once, or you end up with linkage errors if it occurs), and... no upside?

Also, I'm not sure that having separate dylib and cdylib is a very good idea in the large; it might be just better to make dylib work as both a Rust or C dependency. At the moment, if you're using dynamic linking and you have a crate that's a dependency for both Rust and C code, you'll get two instances of it: one as dylib and one as cdylib. To work around this you'd need a second crate to present a C ABI interface to the C code which then has a dependency on the Rust ABI crate, which sounds pretty awkward.

(I've implemented generating cdylib crates in our build system when C has a Rust dependency, but it never seems to quite do what I expect - but I haven't really looked at it yet, since we don't have an immediate need for it.)

@ishitatsuyuki
Copy link
Contributor Author

cdylib sucks: it only works when you only load one type of rust library, otherwise globals are broken. The same applies to "static" dylibs, which is broken by design. There's a long road to have good dynamic loading, but we should make it at least "working".

@alexcrichton
Copy link
Member

Ok so I think it may help to step back here a second and see what's going on. Today I believe there are three primary use cases for Rust dynamic libraries (detailed below). All of these are definitely very important use cases for Rust/Cargo to support!

  1. Exporting a C-style (stable ABI) interface. This is where typically you've got a component written in Rust that's being loaded into another system. For example you've got a plugin for another program or this is a native extension for a different language written in Rust. I believe we have a number of users taking this approach, especially on the native extension side of things.

  2. Using dynamic linking to reduce the size overhead of multiple Rust binaries on the system. This is where typically a Rust binary includes all dependencies statically, but dynamic linking can be used to share these dependencies and have the space requirements be O(binaries + dependencies), not O(binaries * dependencies). @jsgf I believe you're in this category, right? AFAIK there are few (if any) active users of this style of using dynamic libraries.

  3. Using dynamic linking to have a replaceable component without recompiling the rest of the system. This is typically what's done in Linux distributions where libraries come as shared objects and then updates can just recompile that one object without recompiling anything else. Unfortunately Rust basically fundamentally doesn't support this. We don't have a stable ABI much less any help for telling you when you break an ABI! AFAIK there are zero users using dylibs in this style.

So long ago Rust only had one crate type, dylib. That one crate type was used as in scenario (2) above. All Rust libraries were dynamic libraries. Later on we added the rlib crate type which meant it was possible to link a Rust crate statically for the first time. This consequently meant that the compiler could find a crate in one of two formats: rlib or dylib. If it found both, it then had to choose which one! Hence, the -C prefer-dynamic flag was introduced. By default the compiler chooses the rlib format for dependencies found in both formats, but -C prefer-dynamic basically says choose dylib if available (modulo some other logic).

So this went on for a time and meant that we could satisfy use cases (1) and (2) above. With (1) you just compile a dylib without -C prefer-dynamic and for (2) you compile everything as a dylib with -C prefer-dynamic. Eventually though we added Cargo, which started taking care of compilations for you. Cargo defaults to compiling libraries as an rlib and then has support for case (1) above as well through crate-type = ["dylib"].

Fast forward again in time and we get to a point where dylib isn't quite well suited for case (1). The dylib format is now overloaded for cases (1) and (2) and ends up exporting too much (at the symbol level) for case (1). At this time we add a new crate type, cdylib. This new crate type is targeted precisely at users of case (1) above, and Cargo comes with support for it as well. The compiler knows that no cdylib will ever be linked through extern crate, so it can optimize based on that.

Ok so why does Cargo pass -C prefer-dynamic at all today? Well it turns out that rustc itself (from way in the beginning) never changed. It's always used a bunch of dynamic libraries, and that hasn't changed even since rlibs were introduced. We also wanted to support plugins (and eventually macros 1.1 plugins) in the compiler. To work correctly, this means that plugins need to be compiled as dylibs and with -C prefer-dynamic because they must link to libstd dynamically. All in all the reason Cargo deals with -C prefer-dynamic at all is purely just to handle compiler plugins up to this point.

So that brings us to today. Cargo supports (1) with dylib and cdylib. Cargo also supports compiler plugins where -C prefer-dynamic is required. Cargo also has what I believe buggy support around passing -C prefer-dynamic related to workspaces. This was likely introduced in one of the many refactorings to enable Cargo to work with workspaces (where originally it never did).

So where did case (2) and case (3) go? Case (3) has been brought up a number of time with distros, but we're not even ready as a language/compiler so I don't think now's the right time to start preparing Cargo to work. Case (2), however, has never had real support in Cargo. Cargo, in general, has never had great support for configuring how dependencies are compiled (other than compiler flags). For example, crates shouldn't be deciding what crate types they are when consumed by others!

My belief is that if we want to support case (2) we need a radically different solution than in this PR. Such a solution would, for example, be global configuration to compile everything with -C prefer-dynamic and --crate-type dylib, effectively ignoring crate-type annotations in the manifest.


Does that help clear things up? @jsgf does that sound like an accurate summary for (2) where I think you find yourself? @ishitatsuyuki do you have a separate case than the three above for dylibs? Or does your use case slot into one of those three and is explained above?

@jsgf
Copy link
Contributor

jsgf commented Feb 15, 2017

@alexcrichton -

Our use-case is currently (2). The goal is to speed up build+link time for the dev cycle, and mostly because its what we do for C++; I haven't measured how much difference it makes for pure Rust code.

(1) is also something I hope we get to soon, initially for things like writing Python extensions in Rust. But I feel there's also an opportunity for incrementally sneaking in Rust reimplementations of pieces of code in otherwise C++ codebases - in that case the Rust is an internal implementation detail, but it may have dependencies on other Rust crates.

The main concern I have is that I don't really see a difference between (1) and (2) in practice, at least for Linux/ELF. The resulting .so files should be the same in terms of what symbols they export and what their dependencies are on other .so files - the only difference might be some Rust-specific metadata/bytecode, but that shouldn't matter at runtime. Attempting to statically link all the dependencies into is both undesireable (we'd want them to be shared among multiple cdylib crates) and incorrect (linking multiple instances of a crate into the address space is not well defined). Basically I think it's a bug to build any kind of shared/dynamic crates without -C prefer-dynamic.

It gets more complex if a single crate is being used for both (1) and (2) at the same time - for example, a crate that's both a dependency for an executable via a purely Rust path, and also a dependency for a C++ library. In that case, you want a single instance of the crate to be linked in, which requires that there only be a single build model for dynamic crates.

@ishitatsuyuki
Copy link
Contributor Author

I'm using dylibs for plugin architecture, just like rustc. The problems with cdylib is that its global aren't shared (due to static linkage) and it cause unwanted behavior when multiple instance of Rust things are loaded.

@alexcrichton
Copy link
Member

@jsgf note that any successful rustc compilation provides a guarantee that nothing is ever linked twice. In that sense if rustc works successfully then you're guaranteed there's no duplication.

The bug, however, can arise if you compile two separate dylibs (for example) and then load them both as plugins into a Python process, for example. Most existing use cases today I believe only have one dylib and hence are immune to this, but if you'd like to support multiple independently-developed Rust plugins then you'll definitely need shared dependencies!

Also, can you confirm/deny whether a global boolean "build everything as dylibs" would work for you? I suspect it would, but I'm not sure if you'd require further granularity of how dependencies are compiled and such.


@ishitatsuyuki it sounds like you're in bucket (2) then, right? The correct course of action is to build everything as dynamic dependencies, not just the plugin itself.

@jsgf
Copy link
Contributor

jsgf commented Feb 15, 2017

note that any successful rustc compilation provides a guarantee that nothing is ever linked twice. In that sense if rustc works successfully then you're guaranteed there's no duplication.

I don't think it can tell, in general.

If I have a top level executable App, written in Rust. It depends on Rust crate A, and a C++ library B, both linked dynamically. Library B also depends on crate A, linked dynamically.

rustc can see all all of App's Rust transitive dependencies, but it can't "see through" B's dependencies to see that there's another dependency on A. In the current scheme, where there's a distinction between dylib and cdylib, A will get linked in twice in two forms. This may or may not cause linker errors, but certainly rustc won't see it. If the executable gets generated, then it's unclear to me how the runtime linker will resolve symbols two the two instances of A. (And if the cdylib version of A has all its dependencies statically linked in, there'll probably be two copies of the standard library too).

If there's just dylib, then a single libA.so will get linked in, which will satisfy both App and B's dependency.

The bug, however, can arise if you compile two separate dylibs (for example) and then load them both as plugins into a Python process, for example. Most existing use cases today I believe only have one dylib and hence are immune to this, but if you'd like to support multiple independently-developed Rust plugins then you'll definitely need shared dependencies!

Yes, if you're using dlopen() to dynamically link a shared dependency, then all this gets deferred to runtime; that's why a any kind of dylib must link it dependencies dynamically. But my point is that it can happen at link time as a result of composing Rust and C++ in ways that are all completely OK in isolation, but break in composition.

Also, can you confirm/deny whether a global boolean "build everything as dylibs" would work for you? I suspect it would, but I'm not sure if you'd require further granularity of how dependencies are compiled and such.

It comes down to:

  • if I'm building a crate statically, the deps can be either static or dynamic (default static is fine)
  • if I'm building a crate dynamically, the deps must be dynamic; static is simply wrong and there should be no option to do otherwise (other than a "I have a very special case and I know what I'm doing" override). (The other exception to this is if you're aggregating several libraries into a single shared object, which still links to the union of its deps in a shared way; but that's not something Rust supports now.)

In general the static/dynamic choice should be global for the whole dependency graph; if it's a per-edge property then you can end up with conflicts where one edge wants a dep as static, and another as dynamic, which would have to be reconciled (or simply fail the build).

I don't think there's a back-compat issue here, because anything that can tell that it's deps are statically vs dynamically linked is doing something very specialized/unusual, or is simply wrong (like having static state that they're relying on being unshared).

(Also interesting: since rustc compiles with relocation-mode=pic by default for static, in principle it should be possible to construct a shared library from a static .rlib without needing to recompile it - depending on how easy it is to transform the rust-specific metadata.)

@alexcrichton
Copy link
Member

@jsgf oh yeah of course yeah the compiler can't see through native libraries, I meant moreso for a crate graph the compiler knows to forbid duplication which may accidentally arise via linking.

if I'm building a crate dynamically, the deps must be dynamic; static is simply wrong and there should be no option to do otherwise

I guess one thing I'd like to agree on is that this is empirically not true of how Rust is used today. There are of course situations that this can go wrong, but this statement isn't true 100% of the time. For example there are production users that rely on the fact that their Rust plugin does not dynamically link to the standard library, and we would break code if we were to change this behavior.

I definitely agree in the abstract that any given graph can go wrong if you mix static/dynamic in bad ways. In general this isn't really a solvable problem per se though, rustc just tries to provide mitigations against shooting yourself in the foot in known bad situations.

I don't think there's a back-compat issue here, because anything that can tell that it's deps are statically vs dynamically linked is doing something very specialized/unusual, or is simply wrong

This is another point I'd like to drive home agreement on. This PR is a breaking change, full stop. The breakage is outlined in this comment, and it's something I don't think we can deny.

Now that's not to say we can't do anything! It just means we need an alternate solution. For example I'd propose a global configuration option that just builds everything as dynamic libraries, which I believes solves the use cases here?

@jsgf
Copy link
Contributor

jsgf commented Feb 16, 2017

For example there are production users that rely on the fact that their Rust plugin does not dynamically link to the standard library, and we would break code if we were to change this behavior.

Could you go into a bit more detail about this? No doubt a failure of imagination on my part, but I can't think of how this could arise (ie, that their code or env knows/cares about how libstd is linked).

@alexcrichton
Copy link
Member

Sure yeah. So the piece that's changing here is whether the standard library is linked dynamically or not. Today it's not (by default with crate-type = ["dylib"]) but with this PR it would be. With this PR that naturally means that at runtime we'd have to find the dylib, which could fail for a number of reasons:

  • Dylibs may be shipped today to different systems, assuming they have no other dependencies. If this shipping process forgot libstd then it wouldn't be able to run on the target. The shipping process, however, could update to find libstd and ship it to.
  • Dylibs may not always be in LD_LIBRARY_PATH (or the equivalent var). That means that running a binary could work today but break after this as an extra entry would need to be made in the env var.
  • The dylib may work for now, but as soon as rustup update (or similar) happens then the dylib would break as the libstd dylib would disappear and/or change.

Does that make sense?

@jsgf
Copy link
Contributor

jsgf commented Feb 17, 2017

Hm, yes I'd forgotten about deployment issues, especially since Rust doesn't have an ABI can't cope with any version skew (even point releases?).

We only ever use shared/dynamic linking for dev cycles where the artifacts are used where they're built; deployable builds are always static (aside from "platform" libraries like libc and libstdc++).

Anyway; I think the immediate topic of this issue is making cargo build dynamic dependencies when building a dynamic crate; that's the actual problem I'm having, since I can only (easily) build things statically, which means the dynamic/static choice for upstream dependencies isn't even available in any form.

@alexcrichton
Copy link
Member

@jsgf yeah sounds right to me! Want to open an issue about customizing the crate type of transitive dependencies?

@jsgf
Copy link
Contributor

jsgf commented Feb 22, 2017

@alexcrichton - Yeah, that's what #3573 is about.

@alexcrichton
Copy link
Member

I'm going to close this for now due to activity, but please feel free to resubmit of course!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants